1.
NPJ Digit Med ; 6(1): 35, 2023 Mar 08.
Article in English | MEDLINE | ID: mdl-36882633

ABSTRACT

Targeting of location-specific aid for the U.S. opioid epidemic is difficult due to our inability to accurately predict changes in opioid mortality across heterogeneous communities. AI-based language analyses, having recently shown promise in cross-sectional (between-community) well-being assessments, may offer a way to more accurately longitudinally predict community-level overdose mortality. Here, we develop and evaluate TROP (Transformer for Opioid Prediction), a model for community-specific trend projection that uses community-specific social media language along with past opioid-related mortality data to predict future changes in opioid-related deaths. TROP builds on recent advances in sequence modeling, namely transformer networks, to use changes in yearly language on Twitter and past mortality to project the following year's mortality rates by county. Trained over five years and evaluated over the next two years, TROP demonstrated state-of-the-art accuracy in predicting future county-specific opioid trends. A model built using linear auto-regression and traditional socioeconomic data gave 7% error (MAPE) or within 2.93 deaths per 100,000 people on average; our proposed architecture was able to forecast yearly death rates with less than half that error: 3% MAPE and within 1.15 per 100,000 people.
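The abstract compares models by mean absolute percentage error (MAPE) on yearly death rates. A minimal sketch of that metric, with illustrative numbers rather than values from the study:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, as a percentage.

    Assumes no actual value is zero (death rates here are positive).
    """
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical yearly death rates per 100,000 for three counties.
actual = [40.0, 25.0, 10.0]
predicted = [41.0, 24.0, 10.5]

print(round(mape(actual, predicted), 2))  # 3.83
```

A 3% MAPE, as reported for TROP, means forecasts are off by about 3% of the true rate on average, regardless of whether a county's baseline rate is high or low.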

3.
Alcohol Clin Exp Res ; 46(5): 836-847, 2022 05.
Article in English | MEDLINE | ID: mdl-35575955

ABSTRACT

BACKGROUND: Assessing risk for excessive alcohol use is important for applications ranging from recruitment into research studies to targeted public health messaging. Social media language provides an ecologically embedded source of information for assessing individuals who may be at risk for harmful drinking. METHODS: Using data collected on 3664 respondents from the general population, we examine how accurately language used on social media classifies individuals as at-risk for alcohol problems based on Alcohol Use Disorder Identification Test-Consumption score benchmarks. RESULTS: We find that social media language is moderately accurate (area under the curve = 0.75) at identifying individuals at risk for alcohol problems (i.e., hazardous drinking/alcohol use disorders) when used with models based on contextual word embeddings. High-risk alcohol use was predicted by individuals' usage of words related to alcohol, partying, informal expressions, swearing, and anger. Low-risk alcohol use was predicted by individuals' usage of social, affiliative, and faith-based words. CONCLUSIONS: The use of social media data to study drinking behavior in the general public is promising and could eventually support primary and secondary prevention efforts among Americans whose at-risk drinking may have otherwise gone "under the radar."
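The reported accuracy (area under the curve = 0.75) is the area under the ROC curve, equivalent to the probability that a randomly chosen at-risk individual receives a higher model score than a randomly chosen low-risk one. A self-contained sketch of that rank-based computation, with made-up labels and scores:

```python
def roc_auc(labels, scores):
    """ROC AUC via the rank statistic: fraction of positive/negative
    pairs where the positive example scores higher (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical at-risk labels (1) and model scores for six respondents.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.5, 0.4, 0.55, 0.7, 0.2]
print(round(roc_auc(labels, scores), 3))
```

An AUC of 0.75, as in the study, therefore means the model ranks a true at-risk drinker above a low-risk one about three times out of four.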


Subjects
Alcohol-Related Disorders , Alcoholism , Social Media , Alcohol Drinking/epidemiology , Alcohol-Related Disorders/epidemiology , Alcoholism/diagnosis , Alcoholism/epidemiology , Humans , Language
4.
Proc Conf ; 2021: 4515-4532, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34296226

ABSTRACT

In human-level NLP tasks, such as predicting mental health, personality, or demographics, the number of observations is often smaller than the standard 768+ hidden state sizes of each layer within modern transformer-based language models, limiting the ability to effectively leverage transformers. Here, we provide a systematic study on the role of dimension reduction methods (principal components analysis, factorization techniques, or multi-layer auto-encoders) as well as the dimensionality of embedding vectors and sample sizes as a function of predictive performance. We first find that fine-tuning large models with a limited amount of data poses a significant difficulty which can be overcome with a pre-trained dimension reduction regime. RoBERTa consistently achieves top performance in human-level tasks, with PCA giving benefit over other reduction methods in better handling users that write longer texts. Finally, we observe that a majority of the tasks achieve results comparable to the best performance with just 1/12 of the embedding dimensions.
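The "1/12 of the embedding dimensions" finding amounts to projecting 768-dimensional transformer embeddings down to 64 components. A minimal sketch of PCA via SVD under that setting, using random data in place of the study's RoBERTa user embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 768))   # 100 users x 768-dim embeddings (hypothetical)
k = 768 // 12                     # 64 components, the "1/12" setting

# Center the features, then take the top-k right singular vectors
# as the principal directions and project onto them.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:k].T                 # reduced embeddings

print(Z.shape)                    # (100, 64)
```

The reduced matrix `Z` would then be the input to a downstream regression or classifier, which is far better conditioned when the number of users is small relative to 768.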

5.
Proc Int Conf Comput Ling ; 2020: 2913-2923, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33927580

ABSTRACT

Human natural language is produced at specific points in time, while human emotions change over time. While much work has established a strong link between language use and emotional states, few have attempted to model emotional language in time. Here, we introduce the task of affective language forecasting - predicting future change in language based on past changes of language, a task with real-world applications such as supporting mental health treatment or forecasting trends in consumer confidence. We establish some of the fundamental autoregressive characteristics of the task (necessary history size, static versus dynamic length, varying time-step resolutions) and then build on popular sequence models for words to instead model sequences of language-based emotion in time. Over a novel Twitter dataset of 1,900 users and weekly + daily scores for 6 emotions and 2 additional linguistic attributes, we find a novel dual-sequence GRU model with decayed hidden states achieves the best results (r = .66). We make our anonymized dataset as well as task setup and evaluation code available for others to build on.
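The key modeling idea is a GRU whose hidden state decays between time steps, so older emotional signal fades with elapsed time. A minimal sketch of that idea with random weights; this is an assumption-laden illustration, not the authors' dual-sequence architecture:

```python
import math
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DecayedGRUCell:
    """A standard GRU cell, except the hidden state is exponentially
    decayed by the time gap before each update (hypothetical design)."""

    def __init__(self, in_dim, hid_dim, decay=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Stacked weights for the update (z), reset (r), and candidate gates.
        self.W = rng.normal(scale=0.1, size=(3, hid_dim, in_dim))
        self.U = rng.normal(scale=0.1, size=(3, hid_dim, hid_dim))
        self.decay = decay

    def step(self, h, x, gap=1.0):
        h = h * math.exp(-self.decay * gap)   # fade memory by elapsed time
        z = sigmoid(self.W[0] @ x + self.U[0] @ h)
        r = sigmoid(self.W[1] @ x + self.U[1] @ h)
        h_cand = np.tanh(self.W[2] @ x + self.U[2] @ (r * h))
        return (1 - z) * h + z * h_cand

# Run five weekly steps of 8-dim emotion features through a 16-unit cell.
cell = DecayedGRUCell(in_dim=8, hid_dim=16)
h = np.zeros(16)
for x in np.random.default_rng(1).normal(size=(5, 8)):
    h = cell.step(h, x, gap=1.0)
print(h.shape)  # (16,)
```

With `gap` set per observation, the same cell can consume the abstract's mixed weekly and daily sequences: a seven-day gap decays the carried-over state more than a one-day gap.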
